ABSTRACT

The goal of this work is to compare the performance
of response surface methodology (RSM) and two types of
neural networks (NN) to aid preliminary design of two rocket
engine components. A data set of 45 training points and 20
test points, obtained from a semi-empirical model based on
three design variables, is used for a shear coaxial injector
element. Data for supersonic turbine design is based on six
design variables, 76 training data and 18 test data obtained
from simplified aerodynamic analysis. Several RS and NN are
first constructed using the training data. The test data are then
employed to select the best RS or NN. Quadratic and cubic
response surfaces, radial basis neural network (RBNN) and
back-propagation neural network (BPNN) are compared.
Two-layered RBNN are generated using two different training
algorithms, namely solverbe and solverb. A two-layered
BPNN is generated with Tan-Sigmoid transfer function.
Various issues related to the training of the neural networks
are addressed, including number of neurons, error goals,
spread constants, and the accuracy of different models in
representing the design space. A search for the optimum
design is carried out using a standard, gradient-based
optimization algorithm over the response surfaces represented
by the polynomials and trained neural networks. Usually a
cubic polynomial performs better than the quadratic
polynomial but exceptions have been noticed. Among the NN
choices, the RBNN designed using solverb yields more
consistent performance for both engine components
considered. The training of RBNN is easier as it requires
linear regression. This, coupled with the consistency in
performance, promises the possibility of it being used as an
optimization strategy for engineering design problems.
Professor and Dept. Chair, Associate Fellow AIAA
1 INTRODUCTION


1.1 General Background

Advanced rocket propulsion systems are being
proposed to meet goals for increased performance,
robustness, and safety while concurrently decreasing
weight and cost. These new goals are forcing consideration
of design variables over ranges and in combinations not
typically employed, thereby increasing the design space
complexity. Objective and efficient evaluation of these new
and complex designs can be facilitated by development and
implementation of systematic techniques. Accordingly,
Response Surface Methodology 1 (RSM) and Neural
Network 2 (NN) techniques have been used to generate
surrogate models representing data obtained from complex
numerical and experimental simulations. An optimization
algorithm is then used to interrogate these models for
optimum design conditions, based on specified constraints.
In this study, the preliminary design issues related to rocket
propulsion components, including gas-gas injectors and
supersonic turbines, have been investigated. The objective
of this effort is to assess the relative performance of RSM and
NN techniques in representing the design space.

A polynomial-based RSM, in which the design
space is represented with quadratic and cubic polynomials
in the independent variables, is used. The polynomial
coefficients are obtained by linear regression. The
maximum or the minimum of the surface can then be
located using a gradient search method. Response surface
methodologies have been used before for rocket engine
component design. For example, Tucker et al. 3 have used
RSM for rocket injector design. The approach is not tied to
any specific data type or source. The dimensionality of the
data is not a concern, and data obtained through both
numerical and experimental methods can be effectively
used. RSM enables the designer to combine any number of
design variables for different types of injectors and
propellant combinations. This generality allows the
consideration of information at varying levels of breadth
(i.e., scope of design variables) and depth (i.e., details of
the design variables).

The RSM is effective in representing the global
characteristics of the design space and it filters noise
associated with design data. Depending on the order of
polynomial employed and the shape of the actual response
surface, the RSM can introduce substantial errors in certain
regions of the design space. Shyy et al. 4 have shown that
for a given injector design, a third-order response surface
performs better than a second-order surface. Generation of
polynomial-based surfaces can be costly for cases involving
many design variables due to the amount of data required to
evaluate the coefficients. In fact, the number of coefficients
increases rapidly with the order of polynomial. For example, a
complete second-order polynomial of N design variables has
(N+1)(N+2)/(2!) coefficients. A complete cubic model has
(N+1)(N+2)(N+3)/(3!) coefficients. The choice of order of the
polynomial and the terms to be included depends on the
design problem. Many combinations of terms may have to be
tried to represent the design space before the best one can be
selected.

An optimization scheme requiring large amounts of
data and evaluation time to generate meaningful results is of
limited value. While the preliminary designs can be
accomplished with empirically based information, detailed
designs often require use of data from experiments and/or
computational fluid dynamics (CFD) analyses. These data can
be time consuming and expensive to generate in large
quantities. Recently, NN have been used to represent the
models instead of the more typical polynomial RSM. Work in
the area of NN by Shyy et al. 4 and Papila et al. 5 has shown
that some NN can perform well even when a modest amount
of data is available. In particular, radial basis neural networks
(RBNN), like polynomial-based RSM, require only linear
regression for training and have proven to be particularly
accurate. Norgaard et al. 6 and Ross et al. 7 have investigated the
feasibility of reducing wind tunnel test times by using NN to
interpolate between measurements and demonstrated cost
savings. These works have focused on using the NN to predict
data. Attempts to use the network as a function evaluator and
then to link it to the optimizer have been made by Protzel et
al. 8, Rai and Madavan 9 and Greenman and Roth 10.

NN are highly flexible in functional form and hence
can offer significant potential for representing complex
functions. Networks, like RBNN, that are flexible and employ
linear regression methods can use both of these properties to
improve the performance. The number of neurons in the
network, the size of the region over which a neuron is sensitive,
and the training accuracy of the network are some of the
parameters that need to be selected in a network. These can be
determined by comparing the performance of NN designed
with different values of these parameters. Neural networks can
be effectively used in two ways. First, they can be used in
conjunction with RSM. In complex regions of the surface, the
NN can be trained using the existing data. The trained NN can
then be used to generate additional data to augment existing
data, thus possibly enhancing the accuracy of the surface in
that particular area. Such an approach was investigated by
Shyy et al. 4. This work demonstrated that the NN could indeed
yield additional information to help generate more accurate
polynomial-based response surfaces. Second, NN can generate
data to be used directly in conducting gradient-based
optimization. In other words, NN can perform the role of
either enhancing the fidelity of a polynomial-based response
surface, as in the first approach, or generating information as
input to an optimizer by itself without resorting to a
polynomial representation, as in the second approach. Either
way, the only function evaluations required are for the
points sought by the optimizer, which searches the design
space based on the sensitivity of the response to the
perturbations in the design variables.

1.2 Scope 

The present work is aimed at a direct comparison
of the RSM and NN techniques in terms of accuracy and
efficiency; the hybrid RSM-NN scheme noted above will
not be used here. Both techniques are applied to data used
in the design of two rocket engine components: a shear
coaxial injector and a supersonic turbine. Variations of each
technique are evaluated. Both second- and third-order
polynomials will be used for the Response Surface (RS).
Two NN schemes, radial basis and the more commonly
used back-propagation NN, are used. The same database
for each component will be used to train both the RS and
the NN. Both will then be linked to an optimization
procedure. There is little rigorous theory in the literature to
establish the desired framework for a clear comparison
between the performances of the two techniques. However,
this work provides an assessment of the techniques
regarding their practical use in the rocket engine
component design process.

2 APPROACHES 

2.1 Summary of Analytical Models and Design 
Variables 

Two components of a rocket propulsion system
have been considered here, the injector and the turbine.
First, a shear coaxial injector element that uses gaseous
oxygen (GO2) and gaseous hydrogen (GH2) as propellants
is used to investigate the relative performance of RSM and
NN in the design of rocket engine injectors. The original
data set from Tucker et al. 3 (45 design points) is used to
generate quadratic and cubic response surfaces for both
energy release efficiency (ERE), a measure of injector
performance, and chamber wall heat flux (Q). These 45
design points are evenly distributed over the design space.
ERE was obtained using correlations taking into account
combustor length, $L_{comb}$ (length from injector to throat), and
the propellant velocity ratio, $V_f/V_o$. The nominal chamber
wall heat flux at a point just downstream of the injector,
$Q_{nom}$, was calculated using a modified Bartz equation. It
was then correlated with propellant mixture ratio, O/F, and
propellant velocity ratio, $V_f/V_o$, to yield the actual chamber
wall heat flux, Q. The accuracy of each polynomial fit on
the original data set is evaluated. Two different types of
radial basis NN (RBNN) and a back-propagation NN
(BPNN) are also trained to represent ERE and Q. Each
surface is then used to conduct design optimization over the
same range of independent variables. The optimal design
points are compared with exact points calculated from the
empirical model of Calhoon et al. 11. The range of design
variables considered in this study is shown in Table 1.
Twenty additional data points that are not used in the
generation of response surfaces or the neural networks are
used to assess the accuracy of different variants of RSM and
NN.

The other propulsion system component examined is
a supersonic turbine where the preliminary design is
conducted by one-dimensional aerodynamic analysis using
FpgenML 12. FpgenML generates a flowpath and runs a
preliminary meanline calculation on this flowpath. In this
study, a single-stage turbine has been considered. There are six
design parameters and four output variables involved in this
design process. There are 76 design points available for
training. These 76 points were selected by using a
face-centered composite (fcc) design. Instead of the 77 design
points that an fcc design provides for six variables
(2^6 + 2x6 + 1), only 76 were available since the meanline
code could not converge for one of the designs. The design
variables are the mean diameter, D, RPM, blade annulus area,
$A_{ann}$, vane axial chord, $C_v$, blade axial chord, $C_b$, and
stage reaction, $k_r$. These are parameters influencing the
structural properties and performance of the turbine. Overall
efficiency of the turbine, $\eta$, turbine weight, W, a lumped
inertia measure, $(AN)^2$ ($A_{ann} \times RPM^2$), and speed at
pitchline, $V_{pitch}$ ($D \times RPM$), are chosen as dependent
variables. The goal is to maximize the incremental payload
($\Delta pay$), which is derived from turbine weight (W) and
efficiency ($\eta$). Therefore, the objective is a design where W
is minimized and $\eta$ is maximized. Due to the structural
considerations, constraints have to be imposed on $(AN)^2$ and
$V_{pitch}$.

Using 18 additional simulations, distributed within
the design space, the accuracy of the models is tested. The
ranges considered for the design variables and the dependent
variables are shown in Eqs. (1) and (2).

For the design variables:

$$
\begin{aligned}
1.496 &> D > 0.0502 \\
1.4 &> RPM > 0.6 \\
1.3 &> A_{ann} > 0.699 \\
1.706 &> C_v > 0.394 \\
1.143 &> C_b > 0.264 \\
0.5 &> k_r > 0.0
\end{aligned}
\tag{1}
$$

For the dependent variables:

$$
\begin{aligned}
1.116 &> \eta > 0.223 \\
1.801 &> W > 0.422 \\
3.197 &> (AN)^2 > 0.343 \\
1.849 &> V_{pitch} > 0.0484
\end{aligned}
\tag{2}
$$


All the variables involved in the design process are
normalized by their respective baseline values.

2.2 Objective Functions

When attempting to optimize two or more different
objective functions, conflicts between them arise because of
the different relationships they have with the independent
parameters. To solve this problem, a multi-objective
approach is investigated in this study. Here, competing
objective functions are combined into a single composite
objective function. The maximization of the composite
function effectively provides a compromise between the
individual functions. An average of some form is normally
used to represent the composite function. For example,
Tucker et al. 3 used a geometric mean to combine their two
objectives, ERE and Q. The composite desirability is of the
form

$$ D = \left( \prod_{i=1}^{l} d_i \right)^{1/l} \tag{3} $$

where D is the composite objective function, the $d_i$ are
normalized values of the objective functions and l is the
number of objective functions.

Another way of constructing a composite function
is to use a weighted sum of the objective functions. The
composite desirability function can then be expressed as

$$ D = \sum_{i=1}^{l} \alpha_i f_i \tag{4} $$

where D is the composite objective function and the $f_i$ are the
non-normalized objective functions. The $\alpha_i$ are
dimensional parameters that control the importance of each
objective function.

For the injector, the goal is to maximize the
energy release efficiency, ERE, while minimizing the
chamber wall heat flux, Q. This is achieved by maximizing
a composite objective function given by Eq. (5).

$$ D = \left( d_{ERE} \, d_Q \right)^{1/2} \tag{5} $$

where the normalized functions are defined in Eqs. (6) and
(7). In the case where a response should be maximized,
such as ERE, the normalized function takes the form:

$$ d_{ERE} = \left( \frac{ERE - A}{B - A} \right)^{s} \quad \text{for } A \le ERE \le B \tag{6} $$

where B is the target value and A is the lowest acceptable
value. We set $d_{ERE} = 1$ for any ERE > B and $d_{ERE} = 0$ for
ERE < A. The choice of s is made based on the subjective
importance of this objective in the composite desirability
function. In the case where a response is to be minimized,
such as Q, the normalized function takes on the form:

$$ d_Q = \left( \frac{E - Q}{E - C} \right)^{t} \quad \text{for } C \le Q \le E \tag{7} $$

where C is the target value and E is the highest acceptable
value. We set $d_Q = 1$ for any Q < C and $d_Q = 0$ for Q > E.
A, B, C and E are chosen according to the designer's
priorities or, as in the present study, simply as the boundary
values of the domain of ERE and Q. The value of t is again
chosen to reflect the importance of the objectives in the
design. In the study A and B are equal to 95.0 and 99.9,
respectively. Values of C and E are equal to 0.48 and 1.1,
respectively. Both s and t were set to a value of 1.
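The normalized functions of Eqs. (6) and (7) and their geometric-mean combination can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the ERE limits (95.0 and 99.9) are the ones quoted in the text, and the Q limits and the sample ERE and Q values are placeholders chosen to make the arithmetic easy to follow.

```python
def desirability_max(value, low, target, s=1.0):
    """Desirability for a response to be maximized (form of Eq. 6).

    Returns 0 at or below `low`, 1 at or above `target`,
    and a power-law ramp with exponent `s` in between.
    """
    if value <= low:
        return 0.0
    if value >= target:
        return 1.0
    return ((value - low) / (target - low)) ** s


def desirability_min(value, target, high, t=1.0):
    """Desirability for a response to be minimized (form of Eq. 7)."""
    if value <= target:
        return 1.0
    if value >= high:
        return 0.0
    return ((high - value) / (high - target)) ** t


def composite_desirability(desirabilities):
    """Geometric mean of the individual desirabilities."""
    product = 1.0
    for d in desirabilities:
        product *= d
    return product ** (1.0 / len(desirabilities))


# ERE limits from the text; Q limits and sample values are placeholders.
d_ere = desirability_max(97.45, low=95.0, target=99.9)
d_q = desirability_min(0.75, target=0.5, high=1.0)
D = composite_desirability([d_ere, d_q])
```

Because the composite is a product, a design that is unacceptable in either objective (desirability of zero) receives a composite value of zero regardless of the other objective.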
In the case of the turbine, a weighted sum of the two
objectives $\eta$ and W has been used. The expression, in the
context of the turbine, gives the incremental value of the
payload with the change in W and $\eta$. The goal is to maximize
this incremental value, which in turn results in minimum W
and maximum $\eta$.

$$ D = \Delta pay = C_1 \times 100 \times (\eta - \eta_b) - C_2 \times (W - W_b) \tag{8} $$

where $C_1$ = the amount of payload increment capacity for
any efficiency gain
$C_2$ = the amount of payload increment capacity for
any weight gain
$\eta$ = the calculated efficiency
$\eta_b$ = the baseline efficiency
$W$ = the calculated weight
$W_b$ = the baseline weight.

The baseline efficiency and weight are obtained using
existing design knowledge without benefiting from an
optimization strategy. The weight associated with $\eta$, expressed
in percentage by multiplying it with 100, is $C_1$ and the weight
associated with W is $C_2$. This relationship is developed based
on detailed turbopump design processes. For a one percent
increase in efficiency a payload increase of $C_1$ lbs can be
achieved, and as the weight of the turbine increases the
payload has to be correspondingly decreased by a factor of $C_2$.
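As a small worked illustration of Eq. (8), the weighted sum can be written as a single function. The function name and all numerical values below (the coefficients and the efficiency and weight figures) are hypothetical placeholders; the text does not give the actual coefficients.

```python
def delta_payload(eta, weight, eta_base, weight_base, c1, c2):
    """Incremental payload of Eq. (8): gain from efficiency minus penalty from weight.

    c1 is the payload gain per one percent of efficiency,
    c2 is the payload loss per unit of added turbine weight.
    """
    return c1 * 100.0 * (eta - eta_base) - c2 * (weight - weight_base)


# Placeholder numbers: +2 points of efficiency, +0.5 units of weight.
example = delta_payload(eta=0.62, weight=1.5, eta_base=0.60, weight_base=1.0,
                        c1=10.0, c2=2.0)
```

With these placeholder values the efficiency gain contributes 20 and the weight increase subtracts 1, so a design can trade added weight against efficiency as long as the net value grows.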

2.3 Response Surface Methodology (RSM) 

Polynomial RSM constructs polynomials of assumed
order and unknown coefficients based on regression analysis.
The solution for the set of coefficients that best fits the
training data is a linear least-squares problem. The number of
coefficients to be evaluated depends on the order of the
polynomial and the number of design parameters involved.
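The linear least-squares step can be illustrated with a short NumPy sketch that fits a complete quadratic response surface. It is not the JMP procedure used in this work; the function names are hypothetical and the data are synthetic, generated from a known quadratic so that the recovered coefficients can be checked.

```python
import numpy as np


def quadratic_basis(X):
    """Design matrix with columns 1, x_i, and x_i*x_j (i <= j)."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    n_pts, n_vars = X.shape
    cols = [np.ones(n_pts)]
    cols += [X[:, i] for i in range(n_vars)]
    cols += [X[:, i] * X[:, j] for i in range(n_vars) for j in range(i, n_vars)]
    return np.column_stack(cols)


def fit_quadratic_rs(X, y):
    """Least-squares estimate of the quadratic response surface coefficients."""
    coeffs, *_ = np.linalg.lstsq(quadratic_basis(X), np.asarray(y, dtype=float),
                                 rcond=None)
    return coeffs


def predict_quadratic_rs(coeffs, X):
    """Evaluate the fitted surface at new design points."""
    return quadratic_basis(X) @ coeffs


# Synthetic three-level, two-variable design (9 points, 6 coefficients).
levels = np.linspace(0.0, 1.0, 3)
X = np.array([[a, b] for a in levels for b in levels])
y = 1.0 + 2.0 * X[:, 0] - X[:, 1] + 0.5 * X[:, 0] * X[:, 1] + 3.0 * X[:, 1] ** 2
coeffs = fit_quadratic_rs(X, y)  # order: 1, x0, x1, x0^2, x0*x1, x1^2
```

Since the unknown coefficients enter linearly, no iteration or initial guess is needed, which is the property the polynomial RS shares with the output layer of a radial basis network.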

According to the injector model developed by
Calhoon et al. 11, injector performance, as measured by ERE,
depends only on the velocity ratio, $V_f/V_o$, and combustion
chamber length, $L_{comb}$. Therefore, only 15 distinct design
points are available for ERE. Since chamber wall heat flux
depends only on the velocity ratio, $V_f/V_o$, and the oxidizer to
fuel ratio, O/F, there are 9 distinct design points for Q. The
design space for this problem is depicted in Figure 1. For ERE,
the 5 distinct chamber lengths offer the potential for a
fourth-order polynomial fit in $L_{comb}$ while the three different
velocity ratios limit the fit in $V_f/V_o$ to second order. Quadratic and
cubic response surfaces for both ERE and Q have been
generated for evaluation. The above-noted limitations on the
data limit the cubic surfaces to be third order in $L_{comb}$ only.

As already mentioned, to construct a complete
quadratic polynomial of N design variables, the number of
coefficients required is (N+1)(N+2)/(2!). In the turbine case
with 6 design variables, we would need to estimate 28
coefficients. A complete cubic model would require
(N+1)(N+2)(N+3)/(3!) or 84 coefficients and four levels
of each design variable. Since the data available is not
sufficient to evaluate all the cubic terms, reduced cubic
models are employed.
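The coefficient counts quoted above follow from a single binomial expression, since a complete polynomial of order k in N variables has C(N+k, k) terms. The helper below is a hypothetical illustration that reproduces the figures for the two components studied here.

```python
from math import comb


def n_poly_coeffs(n_vars: int, order: int) -> int:
    """Number of coefficients in a complete polynomial of the given order."""
    return comb(n_vars + order, order)


# Injector (3 design variables) and turbine (6 design variables).
counts = {n: (n_poly_coeffs(n, 2), n_poly_coeffs(n, 3)) for n in (3, 6)}
```

For the turbine, 84 cubic coefficients exceed the 76 available training points, which is why only reduced cubic models can be fitted.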

The response surfaces were generated by standard
least-squares regression using JMP 14, a statistical analysis
software package. JMP is an interactive, spreadsheet-based
program having a variety of statistical analysis tools.
Statistical techniques are also available for identifying
polynomial coefficients that are not well characterized by
the data. A stepwise regression procedure based on
t-statistics is used to discard terms and improve the
prediction accuracy. The t-statistic, or t-ratio, of a particular
coefficient is given by the value of the coefficient divided
by the standard error of the coefficient, which is an estimate
of its standard deviation. The accuracy of different surfaces
at points different from the training data can be estimated
by comparing the adjusted root mean square error defined
as:

$$ \sigma_a = \sqrt{ \frac{\sum_{i=1}^{n} e_i^2}{n - n_p} } \tag{9} $$

Here $e_i$ is the error at the i-th point of the training data, n is the
number of training data points and $n_p$ is the number of
coefficients. When the data contains uncorrelated Gaussian
noise, $\sigma_a$ provides an unbiased estimate of that noise. Even
when the error is not solely due to noise, $\sigma_a$ provides a good
overall comparison among the different surface fits.

The accuracy of the models in representing the
objective functions is also gauged by comparing the values
of the objective function at test design points, different
from those used to generate the fit. The root mean square
error, $\sigma$, for the test set is given by:

$$ \sigma = \sqrt{ \frac{\sum_{i=1}^{m} \epsilon_i^2}{m} } \tag{10} $$

In this equation $\epsilon_i$ is the error at the i-th test point and m is
the number of test points.
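The two error measures differ only in their denominators: the training-set measure of Eq. (9) divides by the degrees of freedom left after fitting, while the test-set measure of Eq. (10) divides by the number of test points. A minimal sketch, with hypothetical function names and made-up error values:

```python
import math


def adjusted_rms_error(train_errors, n_coeffs):
    """Adjusted RMS error on the training data (form of Eq. 9)."""
    n = len(train_errors)
    if n <= n_coeffs:
        raise ValueError("need more training points than coefficients")
    return math.sqrt(sum(e * e for e in train_errors) / (n - n_coeffs))


def rms_prediction_error(test_errors):
    """RMS error on the test data (form of Eq. 10)."""
    m = len(test_errors)
    return math.sqrt(sum(e * e for e in test_errors) / m)


sigma_a = adjusted_rms_error([1.0, -1.0, 1.0, -1.0, 1.0, -1.0], n_coeffs=2)
sigma = rms_prediction_error([3.0, 4.0])
```

The guard in the first function reflects the fact that the adjusted error is undefined when a surface has as many coefficients as there are training points, which is the situation of an exactly interpolating model.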

2.4 Neural Networks 

Two different types of NN have been used,
namely radial basis 15 and back-propagation 15. The training
process of the network is a cyclic process and the weights
and biases of the nodes of the network are adjusted until an
accurate mapping is obtained. This trained network can
then predict the values of the objective for any new set of
design variables in the design space. The neural network
toolbox 15 available in Matlab is used for the current
analysis.

2.4.1 Radial Basis Neural Networks (RBNN) 

Radial-basis neural networks are two-layer
networks with a hidden layer of radial-basis transfer
function and a linear output layer (Figure 2). RBNN
require a large number of neurons, depending on the size of
the data set, but they can be designed in a small amount of
time. This is due to the fact that the process of determining
the weights associated with the large number of neurons
uses linear regression. Thus, they may be efficient to train
when there are large amounts of data available for training.
The transfer function for a radial basis neuron is
radbas, which is shown in Figure 2b. Radbas has maximum
and minimum outputs of 1 and 0, respectively. The output of
the function is given by

$$ a = \mathrm{radbas}(\mathrm{dist}(w, p) \times b) \tag{11} $$

where radbas is the transfer function, dist is the vector
distance between the network weight vector, w, and the input
vector, p, and b is the bias. In a radial basis network (Figure
2a) each neuron in the radbas hidden layer is assigned
weights, $w_1$, which are equal to the values of one of the training
input design points. Therefore, each neuron acts as a detector
for a different input. The bias for each neuron in that layer, $b_1$,
is set to 0.8326/sc, where sc is the spread constant, a value
defined by the user. This defines the region of influence of
each neuron. The whole process is then reduced to the
evaluation of the weights, $w_2$, and biases, $b_2$, in the output
linear layer, which is a linear regression problem. If the input
to a neuron is identical to the weight vector, the output of that
neuron is 1, since the effective input to the transfer function is
zero. When a value of 0.8326 is passed through the transfer
function the output is 0.5. For a vector distance equal to or less
than 0.8326/$b_1$, the output is 0.5 or more. The spread constant
therefore defines the radius of the design space over which a
neuron has a response of 0.5 or more. Small values of sc can
result in poor response in a domain not closely located to
neuron positions; that is, for inputs that are far from the
training data as compared to the defined radius, the response
from the neuron will be negligible. Large values will result in
low sensitivity of neurons. Since the radius of sensitivity is
large, neurons whose weights are different from the input
values by a large amount will still have high output, thereby
resulting in a flat network. The best value of the spread
constant for some test data can be found by comparing $\sigma$ for
networks with different spread constants.
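The role of the spread constant can be checked numerically. The sketch below assumes the Gaussian form radbas(n) = exp(-n^2), which is how the Matlab toolbox defines this transfer function and which reproduces the value 0.5 at n = 0.8326 quoted in the text; the function names and sample points are hypothetical.

```python
import math


def radbas(n):
    """Radial basis transfer function, assumed here to be exp(-n^2)."""
    return math.exp(-n * n)


def rb_neuron_output(p, w, spread):
    """Output of one hidden neuron with weight vector w and bias 0.8326/spread."""
    bias = 0.8326 / spread
    dist = math.sqrt(sum((pi - wi) ** 2 for pi, wi in zip(p, w)))
    return radbas(dist * bias)


center = [0.5, 0.5]
at_center = rb_neuron_output([0.5, 0.5], center, spread=0.2)   # distance 0
at_radius = rb_neuron_output([0.7, 0.5], center, spread=0.2)   # distance = spread
far_away = rb_neuron_output([1.5, 0.5], center, spread=0.2)    # distance = 5 * spread
```

The neuron responds with 1 at its own training point, with about 0.5 at a distance of one spread constant, and with an essentially negligible value a few spread constants away.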

In Matlab, radial-basis networks can be designed
using two different design procedures, solverbe and solverb.
Solverbe designs a network with zero error on the training
vectors by creating as many radial basis neurons as there are
input sets. Therefore, solverbe may result in a larger network
than required and map the training data exactly, thereby fitting
numerical noise. A more compact design in terms of network
size is obtained from solverb, which creates one neuron at a
time to minimize the number of neurons required. At each
epoch or cycle, neurons are added to the network until a
user-specified RMS error is reached or until the network has the
maximum number of neurons possible. The design parameters
for solverb are the spread constant, a user-defined RMS error
goal, and the maximum number of epochs, whereas it is only
the spread constant for solverbe.
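The exact design idea (one neuron per training point, output layer found by linear regression) can be sketched with NumPy as follows. This is an illustration of the principle only and is not the toolbox implementation of solverbe; the function names, the Gaussian transfer function exp(-n^2), the synthetic data and the spread value are all assumptions made for the example. The incremental neuron selection of solverb is not shown.

```python
import numpy as np


def hidden_layer(X, centers, spread):
    """Radial basis activations: exp(-(dist * 0.8326 / spread)^2)."""
    bias = 0.8326 / spread
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-(dist * bias) ** 2)


def design_exact_rbnn(X_train, y_train, spread):
    """One neuron per training point; output weights and bias by least squares."""
    H = hidden_layer(X_train, X_train, spread)
    A = np.column_stack([H, np.ones(len(X_train))])  # last column: output bias
    out, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return {"centers": X_train.copy(), "spread": spread,
            "w2": out[:-1], "b2": out[-1]}


def predict_rbnn(model, X):
    """Evaluate the network at the rows of X."""
    H = hidden_layer(X, model["centers"], model["spread"])
    return H @ model["w2"] + model["b2"]


# Synthetic one-variable data set (8 training points).
X_train = np.linspace(0.0, 1.0, 8).reshape(-1, 1)
y_train = np.sin(2.0 * np.pi * X_train[:, 0])
model = design_exact_rbnn(X_train, y_train, spread=0.15)
train_pred = predict_rbnn(model, X_train)
```

The network reproduces the training values to machine precision, which also means that any noise in the training data is reproduced; this is the behaviour that motivates the use of an error goal in solverb.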

In case of the injector design there are two objectives,
namely ERE and Q, and for the turbine the objectives are $\eta$ and W.
Figures 3 and 4 give the variation of $\sigma$ for the network design
with solverbe for the objective functions of the two engine
components. In case of solverb, the error goal during training
also defines the accuracy of the network. An objective of
fitting a numerical model is to remove the noise associated
with the data. A model which maps the data exactly, as solverbe
does, will not eliminate the noise, whereas solverb will.
Figures 5 and 6 give the variation of $\sigma$ for the network
design with solverb for the objective functions of the two
engine components.

By comparing Figures 3-6 it can be seen that for
low values of the spread constant the network has a poor
performance. As the spread constant increases, $\sigma$
decreases asymptotically. However, as demonstrated by
Figure 5a, the performance of the network can deteriorate
for higher values of the spread constant. The region with a
large variation in $\sigma$ is highly unreliable because this
indicates a high sensitivity of the model to a small variation
of the spread constant, and possibly the test data, in this region.
Hence the desirable spread constant is selected from the
region where the performance of the network is relatively
consistent.

Figures 5 and 6 also show the influence of the error
goal on the network. Generally, if a network maps the
training data accurately it can be expected to perform
efficiently with the test data. However, accurately mapping
noisy data may result in poor prediction capabilities for the
network. The variation in the performance is not significant
except for the ERE and Q network (Figure 5), where the
poor performance of the network at high values of spread
constant improves for a larger error goal. This may
indicate the presence of noise in the data for ERE, which
solverb is able to eliminate with an appropriate error goal.
Figure 7 shows variations in the number of epochs and $\sigma$ with
the variation of error goal for a given spread constant
when RBNN is designed with solverb. The number of
neurons in the network is one more than the number of
epochs. One expects that as the error goal increases the
number of epochs becomes smaller and the network
performs less accurately, as in Figures 7a and 7b. However,
as demonstrated by Figures 7c and 7d, a more stringent
error goal for the training data does not necessarily result
in better predictive capability against the test data. For
these objectives, networks that fit the training data less
accurately can have a smaller prediction error.

When choosing an appropriate network the
above-mentioned features have to be considered. The performance
of the constructed NN is best judged by comparing the
prediction error, as given in Eq. (10), for different networks.
Using solverbe, networks are designed with varying spread
constants and the one that yields the smallest error is
selected. When solverb is used, networks are designed for
different spread constants and error goals. The network
that gives the smallest error for the test data is used. The
details of the networks selected are discussed in later
sections.

2.4.2 Back-propagation Neural Networks (BPNN) 

Back-propagation networks are multi-layer
networks with hidden layers of sigmoid transfer function
and a linear output layer (Figure 8). The transfer function in
the hidden layers should be differentiable and thus, either
log-sigmoid or tan-sigmoid functions are typically used. In
this study, a single hidden layer with a tan-sigmoid transfer
function, tansig (Figure 8b), is considered. The output of the
function is given by

$$ a = \mathrm{tansig}(w\,p + b) \tag{12} $$

where tansig is the transfer function, w is the weight vector, p
is the input vector and b is the bias vector. The maximum and
minimum outputs of the function are 1 and -1, respectively.

The number of neurons in the hidden layer of a
back-propagation network is a design parameter. It should be large
enough to allow the network to map the functional
relationship, but not so large as to cause overfitting. Once it has
been chosen, the network design is reduced to adjusting the
weight matrices and the bias vectors. Since for BPNN the
unknown weights are in the nonlinear function, the training
process requires nonlinear regression, which is an
optimization process. This optimization is usually performed
using gradient methods. In Matlab, back-propagation networks can be
trained by using three different training functions, trainbp,
trainbpx and trainlm. The first two are based on the steepest
descent method. Simple back-propagation with trainbp is
usually slow since it requires small learning rates for stable
learning. Trainbpx, applying momentum or adaptive learning
rate, can be considerably faster than trainbp, but trainlm,
applying Levenberg-Marquardt optimization 15, is the most
efficient since it is based on a more efficient optimization
algorithm.

The design parameters for trainlm are the number of
neurons in the hidden layer, a user-defined error goal, and the
maximum number of epochs. The training continues until
either the error goal is reached, the minimum error gradient
occurs or the maximum number of epochs has been met.

For BPNN, the initial weights and biases are
randomly generated and then the optimum weights and biases
are evaluated through an iterative process. The weights and
biases are updated by changing them in the downslope
direction with respect to the sum-squared error of the network,
which is to be minimized. The sum-squared error is the sum of
the squared error between the network prediction and the
actual values of the output. In BPNN (Figure 8a) the weights,
$w_1$, and biases, $b_1$, in the hidden tansig layer are not fixed as in
the case of RBNN. Hence, the weights have a nonlinear
relationship in the expression between the inputs and the
outputs. This results in a nonlinear regression problem, which
takes a longer time to solve than RBNN. Depending upon the
initial weights and biases, the convergence to an optimal
network design may or may not be achieved. Due to the
randomness of the initial guesses, if one desires to mimic the
process exactly for some purpose, it is impossible to re-train
the network with the same accuracy or convergence unless the
process is reinitiated exactly as before. The initial guess of the
weights is a random process in Matlab. Hence to re-train the
network the initial guess has to be recorded.
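One common way to make the random initial guess reproducible is to record the seed of the random number generator rather than the weights themselves. The NumPy sketch below illustrates this for a single-hidden-layer tansig network; it is not the Matlab procedure used in this work, the function names and the uniform initialization range are assumptions, and tansig is evaluated as tanh, to which it is mathematically identical.

```python
import numpy as np


def init_weights(n_inputs, n_hidden, seed):
    """Random initial weights and biases, reproducible through the seed."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.uniform(-1.0, 1.0, (n_hidden, n_inputs)),  # hidden weights
        "b1": rng.uniform(-1.0, 1.0, n_hidden),              # hidden biases
        "w2": rng.uniform(-1.0, 1.0, n_hidden),              # output weights
        "b2": rng.uniform(-1.0, 1.0),                        # output bias
    }


def forward(params, p):
    """Tan-sigmoid hidden layer followed by a linear output neuron."""
    hidden = np.tanh(params["w1"] @ p + params["b1"])
    return params["w2"] @ hidden + params["b2"]


first = init_weights(n_inputs=3, n_hidden=4, seed=42)
repeat = init_weights(n_inputs=3, n_hidden=4, seed=42)
other = init_weights(n_inputs=3, n_hidden=4, seed=7)
output = forward(first, np.array([0.2, 0.5, 0.9]))
```

Storing the seed alongside the trained network is enough to restart training from the same point, provided the rest of the training procedure is deterministic.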

The architecture is decided based on past experience
with similar kinds of datasets. For a given objective the error
goal is fixed and the number of hidden layer neurons is
varied between 2 and the total number of inputs. Each network
is retrained a few times so that the search starts from
different random initial weights and biases. The networks that
do not achieve the error goal are discarded. Among the
converged networks the best network is selected based on
the value of $\sigma$. The goal is to attain as low a value of
$\sigma$ as possible. The number of neurons in the hidden layer
is increased one at a time until the error goal is achieved and
a small value of $\sigma$ is obtained. Although this method may
not be the best way to obtain the best BPNN, it is considered
adequate for the current study. At times a larger network has
a high value of $\sigma$, which may be due to overfitting of the
design space. To prevent the model from converging to a
local minimum, an iterative method is used as suggested by
Stepniewski et al. [16]. The obtained network is retrained with
initial weights obtained by perturbing the weights of the
obtained network.
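The selection rule described above can be sketched as follows. The record fields (`neurons`, `train_error`, `sigma`) and all numbers are hypothetical placeholders, not results from this study: runs that miss the error goal are discarded and the converged run with the lowest test error is kept.

```python
def select_network(runs, error_goal):
    """Keep runs that met the error goal, then pick the lowest test error."""
    converged = [run for run in runs if run["train_error"] <= error_goal]
    if not converged:
        return None  # no candidate reached the error goal
    return min(converged, key=lambda run: run["sigma"])

# Hypothetical training records, one per retrained network.
runs = [
    {"neurons": 2, "train_error": 0.0040, "sigma": 1.9},
    {"neurons": 3, "train_error": 0.0010, "sigma": 0.8},
    {"neurons": 4, "train_error": 0.0009, "sigma": 0.6},
    {"neurons": 5, "train_error": 0.0008, "sigma": 1.4},
]
best = select_network(runs, error_goal=0.001)
```

In this made-up example the five-neuron run meets the error goal but has a higher test error than the four-neuron run, which mirrors the overfitting behaviour noted in the text.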

$w = w_n + \lambda\, r\, w_n$   (13)

where $w$ is the initial weight vector for the network to be
trained, $w_n$ is the weight vector of the obtained network,
$\lambda$ is the level of perturbation (0.1) and $r$ is a matrix
of random numbers between -1 and 1.
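A minimal sketch of the perturbation in Eq. (13) is given below. It assumes that the random matrix acts elementwise, so that each weight is changed by a random fraction of itself no larger than the perturbation level; the function name and the sample weights are hypothetical.

```python
import random

def perturb_weights(w_n, level=0.1, rng=None):
    """Return restart weights: each weight plus a random fraction of itself."""
    rng = rng or random.Random()
    return [w + level * rng.uniform(-1.0, 1.0) * w for w in w_n]

w_n = [0.8, -1.5, 2.0, 0.05]                     # hypothetical converged weights
w_start = perturb_weights(w_n, level=0.1, rng=random.Random(0))
```

Each entry of `w_start` stays within 10 percent of the corresponding converged weight, so the retraining starts near, but not at, the previous solution.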

2.5 Design Optimization Process 

The entire optimization process can be divided 
into two parts: 

1) RS/NN training phase for establishing an 
approximation, 

2) Optimizer phase. 

In the first phase, RS or NN are generated with the
available training data set. In the second phase the
optimizer uses the RS/NN during the search for the
optimum until the final converged solution is obtained. The
initial set of design variables is randomly selected from
within the design space. The flowchart of the process is
shown in Figure 9.

The optimization problem at hand can be
formulated as: minimize $f(x)$ subject to $lb \le x \le ub$,
where $lb$ is the lower boundary vector and $ub$ is the upper
boundary vector of the design variables vector $x$. If the goal
is to maximize the objective function, then $f(x)$ can be
written as $-g(x)$, where $g(x)$ is the objective function.
Additional linear or nonlinear constraints can be incorporated
if required. The present design process does not have any such
additional constraints. The optimization toolbox in
Matlab used here employs a sequential quadratic
programming algorithm.
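The bound-constrained formulation can be sketched as below. This is not the sequential quadratic programming algorithm of the Matlab toolbox: a simple projected gradient method with finite-difference gradients is swapped in to keep the example self-contained, and the surrogate `g` is a hypothetical stand-in for a trained RS or NN. Maximization is handled by minimizing $-g(x)$, and the starting point is drawn randomly from within the bounds.

```python
import random

def numerical_gradient(f, x, h=1e-6):
    """Central-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad

def clip(x, lb, ub):
    """Project x back onto the box lb <= x <= ub."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lb, ub)]

def minimize_box(f, x0, lb, ub, step=0.1, iters=500):
    """Projected gradient descent on f within the bounds."""
    x = clip(x0, lb, ub)
    for _ in range(iters):
        grad = numerical_gradient(f, x)
        x = clip([xi - step * gi for xi, gi in zip(x, grad)], lb, ub)
    return x

def g(x):
    """Hypothetical surrogate to be maximized."""
    return -(x[0] - 0.3) ** 2 - (x[1] - 2.0) ** 2

def f(x):
    """Maximizing g is the same as minimizing f = -g."""
    return -g(x)

lb, ub = [0.0, 0.0], [1.0, 1.0]
rng = random.Random(0)
x0 = [rng.uniform(lo, hi) for lo, hi in zip(lb, ub)]  # random start in the design space
x_opt = minimize_box(f, x0, lb, ub)
```

The first variable settles at its interior optimum, while the second is driven to its upper bound because the unconstrained maximum of the surrogate lies outside the design space.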

3 RESULTS AND DISCUSSION 

The RS and NN are constructed using the training
data. The test data are then employed to select the best RS or
NN. Specifically, in RSM the difference between the RS
and the training data, as given by Eq. (9), is normally used
to judge the performance of the fit. The additional use of
the test data helps to evaluate the performance of different
polynomials over design points not used during the training
phase. This gives a complementary insight into the quality
of the RS over the design space. For both rocket engine
components, different polynomials were tried. Table 2
compares the performance of different polynomials used to
represent the two objective functions of the injector case, ERE
and Q. Starting with all the possible cubic terms in the
model, revised models are generated by removing and adding
terms. A similar analysis is also done for the turbine
case. The best polynomial is selected based on a combined
evaluation of $\sigma_a$ and $\sigma$.

For the NN, the test data help evaluate the accuracy
of networks with varying numbers of neurons in BPNN and
varying spread constants in RBNN. Thus the test data are part
of the evaluation process that selects the final NN. Based on
the RSM or NN model, a search for the optimum design is carried
out using a standard, gradient-based optimization algorithm
over the response surfaces represented by the polynomials and
trained neural networks.
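The use of the test data for model selection can be sketched as follows. The error measure below is a plain root-mean-square error expressed as a percentage of the mean response, in line with the table captions; it is an assumption and may differ in detail from the definitions given earlier in the paper. The model names and numbers are hypothetical.

```python
import math

def rms_error_percent(predicted, actual):
    """RMS prediction error as a percentage of the mean actual response."""
    n = len(actual)
    rms = math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)
    mean = sum(actual) / n
    return 100.0 * rms / abs(mean)

# Hypothetical test responses and predictions from two candidate models.
test_actual = [95.0, 96.5, 98.0, 99.0]
predictions = {
    "model A": [94.0, 97.5, 97.0, 100.0],
    "model B": [95.2, 96.4, 98.1, 98.8],
}
scores = {name: rms_error_percent(pred, test_actual)
          for name, pred in predictions.items()}
best_model = min(scores, key=scores.get)
```

The candidate with the lowest test error is retained, which is the role the test data play for both the polynomials and the networks.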

3.1 Shear-Coaxial Injector 

According to the available data, the injector
performance, ERE, depends only on the velocity ratio,
$V_f/V_o$, and the combustion chamber length, $L_{comb}$, which
gives 15 distinct design points for ERE. The chamber wall heat
flux, Q, depends on the velocity ratio, $V_f/V_o$, and the
oxidizer-to-fuel ratio, O/F, and has nine distinct points. For
ERE, as seen from Figure 1, the five distinct levels of
$L_{comb}$ offer the potential for a fourth-order polynomial
fit in that variable, while the three different levels of
velocity ratio and oxidizer-to-fuel ratio limit the fit in
these variables to second order.
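The counting argument above can be written out as a short sketch. The level counts are taken from the text; the rule that n distinct levels support at most an (n - 1)th-order fit in that variable is the standard interpolation limit, and the variable names are illustrative only.

```python
from itertools import product

# Number of distinct levels of each design variable, as stated in the text.
levels = {"velocity_ratio": 3, "o_f_ratio": 3, "l_comb": 5}

def max_polynomial_order(n_levels):
    """n distinct levels determine a polynomial of order n - 1 at most."""
    return n_levels - 1

# ERE depends on velocity ratio and chamber length; Q on velocity ratio and O/F.
ere_points = list(product(range(levels["velocity_ratio"]), range(levels["l_comb"])))
q_points = list(product(range(levels["velocity_ratio"]), range(levels["o_f_ratio"])))

orders = {name: max_polynomial_order(n) for name, n in levels.items()}
```

This reproduces the 15 distinct points for ERE, the nine distinct points for Q, and the fourth-order and second-order limits quoted above.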

A reduced quadratic and an incomplete cubic
response surface are used for each of the two objective
functions. The first model in Table 2a and the sixth model in
Table 2b are the selected cubic models for ERE and Q,
respectively. There is no noticeable improvement among the
remaining cubic models for ERE. For Q, the selected model is
the best in terms of $\sigma_a$, although there are other
models with an identical value of $\sigma$.

$ERE = 70.43 + 1.580\,(V_f/V_o) + 6.208\,L_{comb} - 0.190\,(V_f/V_o)\,L_{comb} - 0.331\,L_{comb}^2$   (14)

$Q = 0.479 - 0.046\,(O/F) + 0.191\,(V_f/V_o) - 0.009\,(O/F)^2 - 0.028\,(O/F)(V_f/V_o)$   (15)

$ERE = 50.059 + 3.758\,(V_f/V_o) + 14.573\,L_{comb} - 0.05\,(V_f/V_o)^2 - 0.111\,(V_f/V_o)\,L_{comb} - 1.459\,L_{comb}^2 + 0.002\,(V_f/V_o)^2 L_{comb} + 0.046\,(V_f/V_o)\,L_{comb}^2 + 0.047\,L_{comb}^3$   (16)

$Q = -0.566 - 0.358\,(O/F) + 0.383\,(V_f/V_o) - \text{[illegible]}\,(O/F)^2 - 0.107\,(O/F)(V_f/V_o) - 0.003\,(V_f/V_o)^2 + 0.005\,(O/F)^2 (V_f/V_o) + 0.002\,(O/F)(V_f/V_o)^2$   (17)

Equations (14) and (15) are the reduced quadratic
response surfaces and Eqs. (16) and (17) are the reduced cubic
polynomials used for the two objective functions. The
t-statistics for the coefficients in Eq. (14) vary between
49.30 and 8.06. For the coefficients in Eq. (15), they vary
between 6.28 and 0.52. In Eqs. (16) and (17), the t-statistics
of the coefficients vary between 14.69 and 0.31 and between
3.36 and 0.74, respectively.
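For readers unfamiliar with the t-statistic, the sketch below computes it for the simplest case, a straight-line fit with one variable, as the ratio of each coefficient to its standard error. The data are hypothetical and the closed-form expressions apply only to this one-variable case; the response surfaces in this study have many more terms.

```python
import math

def linear_fit_t_statistics(xs, ys):
    """Least-squares line y = a + b*x with t-statistics of a and b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    variance = residual_ss / (n - 2)          # two fitted coefficients
    se_slope = math.sqrt(variance / s_xx)
    se_intercept = math.sqrt(variance * (1.0 / n + x_mean ** 2 / s_xx))
    return {
        "intercept": intercept,
        "slope": slope,
        "t_intercept": intercept / se_intercept,
        "t_slope": slope / se_slope,
    }

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]   # hypothetical noisy responses
fit = linear_fit_t_statistics(xs, ys)
```

A coefficient with a small t-statistic is poorly determined relative to its uncertainty, which is why such terms are candidates for removal when a reduced polynomial is built.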

The radial basis networks designed with solverbe
are the largest, with 15 neurons in the hidden layer for the
ERE network and nine neurons for the Q network. Solverb
designs a network for ERE with 14 neurons in the hidden
layer and a network for Q with eight neurons. Compared to
RBNN, BPNN has fewer neurons: the number of neurons in
the hidden layer is eight and four for the ERE and Q
networks, respectively. Details of the networks used are
listed in Table 3. The spread constants used for the RBNNs and
the error goals of the training data are also given in Table 3.
The spread constant values are selected from the region
where the performance of the network remains consistent as
the spread constant varies (Figures 3-6). The error goal
in the case of solverb is selected based on the network with
the best performance for the ideal spread constant (Figure
7).
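The reason an exactly designed radial basis network has as many hidden neurons as training points can be seen in the following sketch. It is a simplified, one-input illustration and not the Matlab solverbe routine: the output bias is omitted, and the scaling of the hidden-layer bias as 0.8326/spread (so that a neuron outputs 0.5 at a distance equal to the spread constant) is an assumption. The output weights follow from a single linear solve.

```python
import math

def radbas(n):
    """Gaussian radial basis transfer function."""
    return math.exp(-n * n)

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a square system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - partial) / m[r][r]
    return x

def design_exact_rbf(xs, ys, spread):
    """One hidden neuron per training point; weights from a linear solve."""
    bias = math.sqrt(math.log(2.0)) / spread   # about 0.8326 / spread
    phi = [[radbas(bias * abs(xi - c)) for c in xs] for xi in xs]
    weights = solve_linear(phi, ys)

    def network(x):
        return sum(w * radbas(bias * abs(x - c)) for w, c in zip(weights, xs))

    return network, len(xs)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]     # hypothetical training inputs
ys = [1.0, 3.0, 2.0, 5.0, 4.0]     # hypothetical training responses
network, n_neurons = design_exact_rbf(xs, ys, spread=1.0)
```

The network reproduces every training response, which corresponds to a zero error goal, and the number of hidden neurons equals the number of training points, as with the 15-neuron and nine-neuron networks above.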

The error in predicting the values of the objective
function by the different schemes is given in Table 4. Several
observations can be readily made.

1. Both NNs perform better than the RSM for this data 
set. 

2. Both solverbe and solverb are of comparable 
performance. 

3. The BPNN helps generate smaller networks and 
performs at par in comparison to RBNN. 

4. The cubic polynomial is more accurate than the
quadratic one.

The various models generated are compared with
test data in Figures 10 and 11. The curves representing the
NN predictions are closer to the data obtained from the
injector model than the RSs, thereby demonstrating that NN
models are able to predict better than the RSs. BPNN
performs as well as RBNN but tends to be flat. Due to its
lower order, the quadratic polynomial is flat. The cubic
polynomial is able to perform better than the quadratic.

The optimum solutions obtained from the various
schemes are shown in Table 5 and Figures 12 and 13. The
aim is to maximize ERE and minimize Q. The trend of the
objective functions in the design space is monotonic and
hence every model is able to select an identical optimum
design for the given constraints. The flatness of the
polynomials results in poor predicted values of the
objective function for the optimum design. The cubic
polynomial is more flexible than the quadratic but is not
consistent. For a $V_f/V_o$ constraint of 4 the quadratic
polynomial is more accurate, but for higher values of
$V_f/V_o$ the cubic polynomial is more accurate. In contrast,
the NN models are able to perform well. Since the optimum
design happens to be the same as one of the training points,
solverbe is able to predict the values of the objective
function accurately. Solverb performs equally well, thereby
showing that comparable performance is possible with fewer
neurons. The performance of BPNN is not as satisfactory as
suggested in Table 4. For lower values of the $V_f/V_o$
constraint it performs poorly, but for higher values of
$V_f/V_o$ it is good. This may be due to the selection of fewer
neurons in the hidden layers of the networks. Overall, it is
still better than the RSM and demonstrates the flexibility of
NN over RS.

As stated by Papila et al., when it comes to choosing
between NN and polynomials, polynomials are easy to
compute. The coefficients may be numerous, but the linearity
of the system expedites their evaluation. This is also the
reason RBNN train fast. On the other hand, the weights of BPNN
are evaluated through a nonlinear optimization, which slows
the training process. Of all the NN presented here, the one
designed with solverbe is the fastest to train, since the
weights are set to the values of the input variables. Solverb
trains by adding one neuron at a time, with weights similar to
the inputs, and hence is slower.

3.2 Supersonic Turbine 

The generation of RS and the training of the NNs are
done with the 76 design points in Table 5A. The analysis was
initially done without the constraints and then with the
constraints on $AN^2$ and $V_{pitch}$.

A quadratic RS was initially generated. Then, cubic
terms were included. Cubic terms that are products of three
different variables were included because of the number of
data available and the number of levels being three. The trend
of the design data also suggests the presence of some of these
terms. Therefore, the initial cubic equation has 45 terms.
Reduced third-order RSs for $\eta$ and W were selected based on
the relative performance of different polynomials obtained by
removing terms from the initial cubic equation based on
t-statistics. The cubic equation was selected based on the
evaluated values of $\sigma_a$ and $\sigma$. Table 6 suggests
that the reduced cubic polynomial is better than the quadratic
polynomial since $\sigma_a$ is better for the former. The
values of $\sigma$ are comparable.

The t-statistics for the coefficients in the response
surface of $\eta$ vary between 179.72 and 1.2. The coefficients
in the response surface of W have t-statistics varying between
822.66 and 0.68. The response surfaces for $\eta$ and W are as
follows:

q = 0.654 + 2.9 17D + 5608.2 17 RPM -1.0287.4 
-0.0072962. + 0.0054462, -0.0399 6, -4.282 D' 

- 16283.057 D ■ RPM -\ .512x\0 RPM : + 5.22SD4.,,,, 

- 1 346 1 .8234 RPM + 0.0247C D ~ 1 1 4.46762 RPM 
-0.00647 C2 2 -0.0124C,/ -0.1636 r D-300.4406 r /?PA7 

-0.429* A -0.006086 C -0.003626 62, -0.012867 

o47 1 9.62 DA inn RPM + 3S7.74DC RPM - 1 .743DA 
-0.037 DC\.k r - 5384.729 A 1(1(l RPMk, 

-I \3.S6SC h RPMk r < 18) 

. w = 0.644 + 1 .509 D - 96 $42 RPM 

-0.6274 ... fr -0.00452C -0.0041262, -0.02556 

-3.805D J -7040.35 ID -RPM -2.248 DA , r , t 


-13 0124 : r 0.00856C, D + \ 0JAAC v RPM 
-0.00342C, ’ -0.01 04C h D - 23 .359C,, RPM 
+0.0127C,4, ml -0.006096V -0.06866, D 
+93.5276 RPM -0.221 k r A mn - 0.003246, C v 
-0.001836 ,.C, -0.006736/ +93.193 DC V RPM 
-162. 604 DC, RPM + 92 1 . 05 3 Dk , RP M 
+0.342 DA ann C h - 0.692 DA iltm k r -0.0162 DC v k r 
-11.31 162 RPMk r (19) 

The networks designed with solverb have 37 and
75 neurons in the hidden layer for $\eta$ and W, respectively,
while those designed with solverbe have 76 neurons each.
The BPNN uses significantly fewer neurons, generating
networks with five and 60 neurons for $\eta$ and W,
respectively, in a single hidden layer. The NN architectures
chosen are listed in Table 7.

The accuracy of the various models is tested with
the data available in Table 6A and the error is shown in
Table 8. Solverbe gives a poor prediction for $\eta$, which
might be due to overfitting, but performs well for W. The
outcome of Table 8 for the supersonic turbine is similar to
that of Table 4 for the injector, except that BPNN is clearly
inferior to RBNN. Overall, based on the two cases, it seems
that solverb is the most consistent among all methods
evaluated.

The optimum solutions subject to the constraints,
with $AN^2$ limited to less than 1.132 (normalized by its
baseline value) and $V_{pitch}$ limited to less than 1.148
(normalized by its baseline value), are presented in Table 9.
Since $AN^2$ is proportional to the product of the square of
RPM and $A_{ann}$, and $V_{pitch}$ is proportional to D times
RPM, no NN/RS is generated for them. By comparing the optimal
designs predicted by the various methods, one observes that
solverbe and BPNN yield noticeably larger errors in $\eta$ and
W, respectively. Solverb and the response surface are more
consistent for both $\eta$ and W. Judged by the error in
predicting $\Delta pay$, it seems that the RSM is most accurate.
However, since the real goal is to maximize $\Delta pay$, it is
important to note that the actual value of $\Delta pay$ for the
optimal design chosen by the RSM is the worst. Clearly,
the large multiplier in Eq. (8) causes bias in the relative
weighting between $\eta$ and W, which in turn causes different
"apparent" accuracy levels for the various methods.
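The two constraint quantities can be checked directly from the optimum designs, as sketched below. The design values are the normalized D, RPM and annulus area entries of Table 9. The sketch assumes that, because every variable is normalized by its baseline value, the proportionality constants reduce to one; the percentage error helper is likewise an assumed definition.

```python
# Normalized optimum designs read from Table 9: (D, RPM, A_ann).
designs = {
    "RBNN (solverbe)": (0.972, 1.181, 0.811),
    "RBNN (solverb)": (0.999, 1.149, 0.857),
    "BPNN": (1.024, 1.121, 0.901),
    "Reduced cubic RS": (0.903, 1.272, 0.700),
}
V_PITCH_LIMIT = 1.148   # normalized limit on pitch speed
AN2_LIMIT = 1.132       # normalized limit on AN^2

def pitch_speed(d, rpm):
    """Normalized pitch speed, proportional to D times RPM."""
    return d * rpm

def an_squared(rpm, a_ann):
    """Normalized AN^2, proportional to RPM squared times annulus area."""
    return rpm ** 2 * a_ann

def percent_error(predicted, meanline):
    """Absolute prediction error as a percentage of the meanline value."""
    return 100.0 * abs(predicted - meanline) / abs(meanline)

constraint_values = {
    name: (pitch_speed(d, rpm), an_squared(rpm, a_ann))
    for name, (d, rpm, a_ann) in designs.items()
}
eta_error_solverbe = percent_error(0.810, 0.766)
```

Under this assumption all four optimum designs sit on both constraint limits to within the rounding of the tabulated values, which is consistent with no separate NN or RS being needed for the constraints. The efficiency error for solverbe recomputed from the rounded entries is about 5.7 percent, close to the 5.80 percent listed in Table 9.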

From a design perspective, it is interesting to
understand the impact of the constraints on $AN^2$ and
$V_{pitch}$ on the optimal turbine parameters. Such an
assessment is offered in Figures 14 and 15. As D, RPM and
$A_{ann}$ decrease, $\eta$, W, $V_{pitch}$, $AN^2$ and
$\Delta pay$ decrease. $C_b$ and $C_v$ are almost
constant over the design space and they do not have any
noticeable effect on the objective functions and constraints.
In the case of $C_v$, the BPNN shows a small perturbation for
the analysis with the constraint. This might be due to the
mapping of some noise by BPNN. Otherwise it is
unaffected by the inclusion of the constraints. The stage
reaction, $K_r$, is unaffected as expected, since we are
dealing with only a single stage of the turbine. Hence there is
no split of the stage reaction.

4 SUMMARY AND CONCLUSIONS 

In the present study, the RS and NN are first
constructed using the training data. The test data are then
employed to assess the performance of various polynomials
and to offer insight into model improvement by removing and
adding terms. The best polynomial is selected based on a
combined evaluation of $\sigma_a$ and $\sigma$. For the NN, the
test data help evaluate the accuracy of networks with varying
numbers of neurons in BPNN and varying spread constants in
RBNN. Thus the test data are adopted to help select appropriate
RSM and NN models. Once an RSM or NN model is constructed, a
search for the optimum design is carried out using a standard,
gradient-based optimization algorithm over the response
surfaces represented by the polynomials and trained neural
networks.

Based on the results obtained, we have reached the 
following conclusions. 

1. Higher-order polynomials perform better than lower-order
polynomials as they have more flexibility. However, an
appropriate statistical measure needs to be used to
determine the best terms to include.

2. In the present study, both NN and RSM can perform
comparably for modest data sizes.

3. Among all the NN configurations, RBNN designed with
solverb seems to be more consistent in performance for
both the injector and turbine cases.

4. Radial basis networks, even when designed efficiently
with solverb, tend to have many more neurons than a
comparable back-propagation network with tan-sigmoid or
log-sigmoid neurons in the hidden layer. The basic reason for
this is that sigmoid neurons can have outputs over a large
region of the input space, while radial basis neurons only
respond to relatively small regions of the input space. Thus,
larger input spaces require more radial basis neurons for
training.

5. Configuring a radial basis network often takes less time
than configuring a back-propagation network because the
training process for the former is linear in nature.

6. RBNN, with the combined features of flexibility and linear
regression, is more accurate than BPNN, which is
nonlinear.
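The contrast drawn in conclusion 4 between sigmoid and radial basis neurons can be illustrated with a short sketch. The sample inputs and the 0.05 activity threshold are arbitrary choices for illustration; the tan-sigmoid is written as the hyperbolic tangent and the radial basis function as a unit-width Gaussian.

```python
import math

def tansig(n):
    """Tan-sigmoid transfer function (hyperbolic tangent)."""
    return math.tanh(n)

def radbas(n):
    """Gaussian radial basis transfer function."""
    return math.exp(-n * n)

inputs = [-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0]
threshold = 0.05   # arbitrary level below which a neuron is treated as inactive

active_tansig = sum(1 for n in inputs if abs(tansig(n)) > threshold)
active_radbas = sum(1 for n in inputs if abs(radbas(n)) > threshold)
```

The sigmoid neuron gives a non-negligible output at every sample except its zero crossing, whereas the radial basis neuron responds only near its center, so covering a wide input range takes more radial basis neurons.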

Based on the results shown in Tables 4 and 8, it is seen
that the RBNN technique performs consistently and holds
promise for the design/optimization of advanced rocket
propulsion components. The method adopted here to generate
BPNN is not necessarily the most efficient. Given a better
method of selecting the number of neurons in the hidden layer,
BPNN might be able to perform better. Future work will be
aimed at implementing a better design procedure for
back-propagation networks. The work has been carried out with
modest data sizes and the training is fast for such cases.
Issues related to the number of design variables and training
data size are critical for practical design applications, and
should be addressed in the future.

Table 1: Range of design variables considered for the shear coaxial injector element.

Table 2(a): Different cubic polynomials for ERE. (Dependent variables: $V_f/V_o$ and $L_{comb}$; 15 training points,
10 test points) (Errors are given in percentages of the mean value of the responses).

Table 2(b): Different cubic polynomials for Q. (Dependent variables: O/F and $V_f/V_o$; 9 training points, 4 test
points) (Errors are given in percentages of the mean value of the responses).

Table 4: RMS error in predicting the values of the objective function by various schemes for the shear
coaxial injector element (Errors are given in percentages of the mean value of the responses).

Table 5: Optimal solutions for fixed values of $V_f/V_o$ and [...]
and RSM schemes for the shear coaxial injector element.
(Error given in parentheses for each prediction is in %.)

Table 6: Training and prediction errors for different response surfaces of the objective functions of the
supersonic turbine. (Errors are given in percentages of the mean value of the responses)


Scheme                                             RBNN (Solverbe)    RBNN (Solverb)      BPNN
# of layers                                        [illegible]        [illegible]         [illegible]
# of neurons in the hidden layer, $\eta$           76                 37                  5
# of neurons in the hidden layer, W                76                 75                  60
# of neurons in the output layer                   [illegible]        [illegible]         [illegible]
Error goal aimed for during training, $\eta$       0.0 (sc = 9.50)    0.001 (sc = 6.50)   0.001
Error goal aimed for during training, W            0.0 (sc = 9.45)    0.001 (sc = 8.35)   0.001

Table 7: Neural network architectures used to design the models for $\eta$, W and [illegible] of the supersonic
turbine. (sc = spread constant)


Scheme              $\sigma$ for $\eta$ (%)   $\sigma$ for W (%)
RBNN (Solverbe)     1.251                     1.096
RBNN (Solverb)      0.292                     1.102
BPNN                0.777                     2.563
Reduced Cubic RS    1.031                     1.223

Table 8: [...] for the supersonic turbine.
(Errors are given in percentages of the mean value of the responses)

Scheme              D      RPM    $A_{ann}$  $C_v$  $C_b$  $K_r$  $\eta$        W             $V_{pitch}$  $AN^2$  $\Delta pay$
RBNN (Solverbe)     0.972  1.181  0.811      1.443  0.836  0.0    0.810 (5.80)  0.636 (0.74)  1.148        1.132   -0.139 (29.80)
Meanline            0.972  1.181  0.811      1.443  0.836  0.0    0.766         0.641         1.148        1.132   -0.197
RBNN (Solverb)      0.999  1.149  0.857      1.483  0.792  0.0    0.785 (1.75)  0.653 (0.17)  1.148        1.132   -0.177 (9.16)
Meanline            0.999  1.149  0.857      1.483  0.792  0.0    0.772         0.654         1.148        1.132   -0.194
BPNN                1.024  1.121  0.901      1.168  1.143  0.0    0.793 (2.49)  0.608 (8.63)  1.148        1.132   -0.153 (21.49)
Meanline            1.024  1.121  0.901      1.168  1.143  0.0    0.772         0.666         1.148        1.132   -0.195
Reduced Cubic RS    0.903  1.272  0.700      1.706  0.871  0.0    0.758 (1.50)  0.591 (2.10)  1.148        1.132   -0.194 (8.40)
Meanline            0.903  1.272  0.700      1.706  0.871  0.0    0.746         0.604         1.148        1.132   -0.211

Table 9: Optimal solutions with constraints on $V_{pitch}$ and $AN^2$ for a supersonic turbine. (Error given in
parentheses for each prediction is in %). (All variables are normalized by their respective baseline values)

Figure 2: (a) Radial basis network.

Figure 15: Effect due to presence (case 1) or lack of constraints (case 2) on objective functions. (a)
Optimum efficiency, $\eta$ (%), (b) optimum weight, W (lbs), (c) optimum pitch speed, $V_{pitch}$ (in./sec), (d)
optimum annulus area × RPM$^2$, $AN^2$ (in$^2$·rpm$^2$) and (e) optimum incremental payload, $\Delta pay$ (lbs). (All
variables are normalized by their respective baseline values).



